\begin{savequote}[8cm]
The social sciences are granted eternal youth because findings must be
revisited.
\qauthor{Max Weber}
\end{savequote}

\chapter{Online social services: an overview}
\label{ch:social_nets}
%Among the many important innovations brought forward by the
%development of the Internet and, later, of the Web, there is the possibility to
%allow fast and reliable online interactions between individuals all around the
%world. 
Starting from instant chat and bulletin boards systems, and then with
email, newsgroups and discussion forums, people have engaged in online
social interactions since the inception of the Internet. Nonetheless, at the beginning of the 21st century a new
generation of ``social networking services'' came to prominence,
           revolutionising,
within a few years, the
way users access the Web and spend their time online. Nowadays
services like Facebook and Twitter boast hundreds of millions of active users;
in addition, the amount of time spent by users on such sites is soaring.

These online services are already so popular, and their usage so entrenched in our
society, that they have even changed our common language, introducing words such
as \textit{unfollow} (``verb: stop tracking a person, group, or organisation
on a social networking site'') or adding new meanings to existing words
such as \textit{friend} (``verb: add someone to a list of friends or contacts
on a social networking website'')\footnote{The Oxford English Dictionary
introduced these modifications in 2011 and 2010, respectively.}.

In this chapter we provide a broad overview of online social networking
services, discussing their classification and their main properties. We also
review related systems and applications that exploit social information to offer a wide
set of functionalities.  Our main emphasis is on highlighting how spatial and
geographic factors may influence the properties of online social connections,
considering their consequential impact on social-based systems and
applications.
  
\paragraph{Chapter outline}
Section~\ref{sec:osns} introduces a definition of social networking services and
then examines their general taxonomy. The most important properties of the
social networks arising among online users are discussed in
Section~\ref{sec:osn_properties}, which reviews a vast set of related works in the
literature.  In Section~\ref{sec:lbsns} we describe the new
generation of social services mainly based on location sharing and we discuss
the connection they offer between online social interactions and real entities in the
physical world.  Systems and applications related to online social services are
discussed in Section~\ref{sec:soc_systems}. 
Finally, Section~\ref{sec:diss_outlook} draws lessons from the research results we survey,
discussing how this dissertation differs from the growing body of literature,
opening new research directions.

\section{Classification of social networking services}
\label{sec:osns}
The abundance of online services providing social features or supporting social
interactions among users makes it difficult to define properly when a service
can be considered an online social
networking platform. We discuss a definition recently proposed by scholars
and, then, we analyse a classification of these services according to the semantics
of the social connections they support and the related structural directionality.
We conclude this section with a review of recent results focussing on the
characteristics of the most important online social services.

\subsection{Core features of social networking services}
\label{subsec:definition}
As discussed earlier, social interactions were taking place online before the
rise of services such as Facebook and Twitter. For instance, instant messaging
applications were already widely used about 15 years ago to keep a list of
friends and chat with them in real time. However, social networking sites
present a unique set of characteristics that clearly differentiate them from
earlier services.

One concise definition relies on the fact that a social networking
service is a Web-based service that offers these three features to its
users~\cite{bE07:sites}:
\begin{enumerate}
\item the creation of a public or semi-public profile within a bounded system;
\item the ability to create and maintain a list of other users with whom they share a connection;
\item the ability to view and traverse their list of connections and those made by others within
  the system. 
\end{enumerate}

From this definition we note that what identifies social networking services 
is not the fact that they enable online social interaction between
connected members, but, more importantly, that users can carefully craft and
make publicly visible their social connections, and interactions, on the service. This is the
crucial feature that differentiates social networking services from older online
communication tools.

A consequence of this visibility of social ties is that they often represent 
a display of social identity~\cite{Db04:display}. However, there
are deep differences in the structural nature and the social semantics of 
ties supported by different services. These factors are of paramount importance
to understand the characteristics of a given social networking service, its
target user base and the social interactions it supports and fosters;
equally, the applications and systems that each service  enables will be
different. These variations are also likely to impact the spatial properties of
the resulting social ties.

\subsection{Link semantics}
\label{subsec:link_semantics}
As different online social networking services offer different features, and
target different users, the significance of the social connections that their
users establish varies widely. A 
coarse-grained yet effective distinction is
between services that \textit{support} social relationships and services that
\textit{create} connections between people who share interests. For
instance, Facebook users tend to recreate online the set of friendship and
acquaintance ties they have in their daily lives, while LinkedIn supports
work-related connections, mainly to help job-seekers and foster professional
collaboration.  
%Location-based social services such as Foursquare, Gowalla or
%Yelp aid users connect with friends which visit similar physical places.

Other social networks try to entice users with particular interests, such as Last.fm for music-lovers, aNobii for individuals passionate
about books, Flixster for cinephiles and Flickr for amateur and professional
photographers.  Finally, social services such as Twitter and Google Plus
encourage their users to connect to other users, celebrities, media
organisations or product brands, effectively supporting a wide range of link semantics.

\impl The effect of space and distance differs across these services as the
focus of link semantics shifts from supporting existing social relationships
to connecting users who share interests. 
%Evidence suggests that offline social ties are 
%affected by distance, while 
Users connecting because of a common interest in a
given topic might be less likely to be constrained by geographic proximity, as their
tie is fostered by factors less related to the social sphere.
       
\subsection{Link structure}
\label{subsec:link_structure}
At a structural level, the main property of online social platforms is whether
new social connections established by users are unidirectional or bidirectional:
in the latter case, both users must confirm the existence of the social
tie that connects them on the service.

Social services which aim to support trusted relationships, such as
Facebook and LinkedIn, adopt the bidirectional model: users are required to
request, and accept, the creation of friendship connections. In this case there
is only an implicit directionality of the resulting link, as one user has to initiate
the contact and the other has to react. Instead, services
based on shared interests tend to adopt the unidirectional model, where users
can connect with freedom without requiring any reciprocation. In this case the
act of creating a friendship connection is more oriented towards the consumption of
user-generated information: for instance, book reviews, photos, music tracks,
videos or Web links.

This consumer-producer paradigm, where a user generates information and a set
of ``followers'' receive it, has been hugely successful.  Such unidirectional
connection models, more akin to subscriptions than to friendship connections,
are epitomised by services such as Twitter and YouTube, where every user
profile is seen as a channel of updates which is pushed to the subscribers.
Google launched its own social networking service in 2011, Google Plus, with a
unidirectional sharing model that allows users to subscribe to any other
user's updates; at the same time, each user can specify the preferred audience
for every single piece of shared information, with highly granular privacy
control.  Shortly after, Facebook amended its historical preference for mutual
bidirectional connections and allowed users to ``subscribe'' to each other,
providing  sharing filters for users to limit the visibility of each post.

Bidirectional ties are more difficult to establish: malicious users or spammers
can indiscriminately make unidirectional links to several unrelated users,
whereas bidirectional links must be vetted by both parties. Even when
unidirectional links are arising because a user has a genuine interest in
connecting to another user,  there is little
indication that the connection has a social nature, unless the tie is
reciprocated.
On the other hand,
unidirectional ties allow a richer social structure to emerge, since there are
now four possible relationships between any couple of users. Tie asymmetry can
be exploited to infer status or reputation, as the PageRank mechanism used by
Google does to rank Web pages~\cite{PBM99:pagerank}. TunkRank, based on the same
principles of PageRank, has been proposed to rank influential Twitter users
based solely on link structure~\cite{Tun09:tunkrank}.
%TwitterRank~\cite{WLJ10:twitterrank} has been proposed as a  topic-aware
%ranking for Twitter users, again based on a random walker model inspired to PageRank.

\impl We expect spatial constraints to be stronger when online connections
reflect relationships that are as close as possible to social ties in real life:
as a consequence, social services requiring bidirectional ties could be more
affected by geographic distance, whereas services allowing
unidirectional followers could be less constrained by space. 
%We study in Chapter YY how the
%directionality of social ties allowed on online services is related to the
%influence that space and distance have on them.

\subsection{Examples}
\label{subsec:examples}
Even though only in the last decade online social networking has seen an
explosion of interest, social interactions were taking place online much
earlier. In fact, Ward Christensen and Randy Suess created the
Computerised Bulletin Board System, or CBBS, in 1978 to allow computer
enthusiasts to exchange messages in a computerised way, predating modern
social forums\footnote{A brief yet vividly sketched account of online social
networking is The Cartoon History of Social
Networking~\cite{Lon11:socialnets}. Other short reports with similar
material
include~\cite{Nic09:socialnets,Sim09:socialnets,Bia11:socialnets}.}. While CBBS
was used actively by only a handful of users, current online social
platforms have a massive user base. For
instance,  Facebook boasts more than 900 million active users, over 50\% of whom
access the service on any given day~\cite{Fac12:stats}. Similarly, LinkedIn has
more than 135 million members over all the planet, establishing professional
ties with one another~\cite{Lin12:stats}.

Modern online social services have offered researchers the opportunity to
study online social interactions at an unprecedented scale. In addition, since
online services keep track of such interactions in a structured and digital
way, they offer computation-friendly means of recording, storing, accessing and
manipulating the graph of social relationships.  However, the sheer size of
modern social networking services has generated extremely large social graphs,
which pose several analytical and technical challenges. The social connections
between  users, together with any additional available information, are often obtained
through Application Programming Interfaces (APIs) exposed by each service to provide access to their data
in a structured form. Yet, this data acquisition process may be severely
time-consuming, as API calls are typically rate-limited and closely monitored by
the service providers.

At the same time, analytical techniques, algorithms and computational resources
used to study the resulting graphs need to cope with the size of current social
platforms. For instance, one of the largest social systems ever analysed
consisted of about 30 billion instant messaging conversations exchanged over 30
days among 240 million users, gathering about 4.5 terabytes of text logs for
each day~\cite{LH08:msn}. An even larger example is a recent study, which adopted a
methodology based on probabilistic counters to compute graph distances between
Facebook users~\cite{BBR11:four}: the authors analyse a network with about 720
million users and 70 billion friendship connections, discovering that, on
average, Facebook users are only 4 hops apart, with smaller regional networks
exhibiting even lower values.

In the following paragraphs we discuss different kinds of online social
services that have been studied by researchers, ranging from online communities
predating the inception of the Web to modern large-scale social platforms.

\subsubsection{Earlier online communities}
Before the advent of current online social networking services, online services
were neither explicitly recording nor
storing social ties between users. Yet, personal home pages were linking to
each other and users were replying to each other on forums, leaving a digital
trace of these social interactions. Thus, social graphs arising in older online
communities tend to be inferred based on records of such interactions.

One of the first works on the analysis of online social relationships used
email conversations to infer a social graph and study shared interests among
people~\cite{SW93:email}.  ReferralWeb~\cite{KSS97:referral} was a system built
to reconstruct a social network among individuals by mining their online
presence, sifting through Web links across home pages, co-authorship and
citations in academic papers, news archives and organisation charts, with the
aim of exploiting the network structure to locate experts across different topics.
Similarly, social statistics about online interactions in the LambdaMOO
Multi-User Dungeon were collected by an autonomous social software agent,
Cobot~\cite{IKK00:cobot}. Conversations taking place on instant messaging
platforms were used to extract a planetary-scale social network and investigate
homophily effects~\cite{LH08:msn}.

There are other online platforms where social features do not represent the
main focus, such as online shopping and product reviews sites. Such
services also support social interactions between members, usually involving
products in which users share an interest: such social graphs have been exploited to enhance
confidence in sellers, by providing explicit feedback, as well as to provide
useful product recommendations. Several such social networks have been
examined: examples include Epinions~\cite{RD02:epinions},
Amazon~\cite{LAH07:amazon} and Overstock~\cite{SWB08:overstock}.

\subsubsection{Modern online social services}
With the rise of the Web 2.0, loosely defined as the set of online services centred
around consumption and sharing of dynamic user-generated content, the
digital representation of social connections between users became of pivotal
importance, giving birth to the first online services mainly focussed on social
features. As these services accumulated millions and millions of users,
  researchers turned their attention to them. 

One of the earliest studies analysed the social networks arising among the
members of two services, the photo sharing community Flickr and the social
networking site Yahoo!360, finding that each network is segmented into a
well-connected core of users and a fringe of disconnected smaller communities
and isolated nodes~\cite{KNT06:structure}. More studies soon followed,
investigating different social services and offering the first evidence that
online social networks exhibit certain universal features~\cite{MMG07:measurement, AHK07:osn}. Other online communities that
have attracted the attention of researchers include the hugely popular, albeit
now in decline, MySpace~\cite{CW08:myspace,TRW09:myspace} and the social
aggregation service FriendFeed~\cite{GGMC09:friendfeed}.

Some services have received abundant attention due to their sheer popularity
and to their importance, with hundreds of millions of  users accessing them on
a daily basis. We consider a sample of studies for three of the most common
platforms on which people interact: Facebook, Twitter and YouTube.

\paragraph{Facebook} In addition to the basic topological properties, various
social processes have been investigated on Facebook, since it allows users to
interact in a rich manner across several different media (statuses, photos,
likes). Such user interactions reveal a picture very different from the entire
graph of social connections: when social links with lower 
interaction levels are removed the network shrinks and changes structure, thus raising
the challenge to individuate when a tie on Facebook represents a meaningful social
bond between two users~\cite{WBS09:facebook, VMC09:facebook}. Further, since
its inception, user engagement on Facebook is so pervasive that temporal
periodicities connected to human rhythms can be extracted from user
interactions~\cite{GWH07:facebook}. The analysis of information diffusion,
for instance in terms of users sharing Web links received from their friends, 
has been another process largely investigated on Facebook~\cite{SRM09:facebook}.
In particular, a recent work by Bakshy et al.~\cite{BRM12:facebook}  confirms that being exposed to information through online friends
increases the likelihood of re-sharing that information.


\paragraph{Twitter} Launched in 2006 as an SMS-based service, Twitter has been steadily
growing as the main service for publicly sharing status
updates: its public nature, together with the unidirectional nature of its
social connections, has created a unique social ecosystem that has attracted
plentiful of research attention. The motivations that brought early adopters to
use Twitter were discussed in~\cite{JSF07:twitter}, while a classification of
Twitter users was proposed in~\cite{KGA08:twitter}. As a few Twitter users gained
celebrity status by accumulating millions of followers, researchers addressed
the problem of finding the most influential users, adopting different metrics
to quantify the impact of user influence~\cite{CHB10:twitter}.  A large-scale
study of the entire social network of Twitter users found low levels of
reciprocity among users, unlike other social networks, but a considerable
number of news URLs being shared and forwarded over social
connections~\cite{KLP10:twitter}. Based on these findings, the hybrid nature
of Twitter as both a social network and a news media site has been
suggested.

\paragraph{YouTube} Even though YouTube is not strictly a social service, since
it mainly allows user to publish and share their video creations, social
interactions take place on a massive scale: users subscribe to one another's
channels and comment on video items published by others. Video popularity on
YouTube has been extensively studied: one of the central findings is that there
is a strong heterogeneity across videos, with a few of them rising to extremely
high levels of popularity and the vast majority experiencing only a handful of
views~\cite{CKR09:youtube}. Far from being irrelevant, the huge number of
non-popular items still drives user consumption, attracting people interested
in a vast quantity of niche topics: this effect has been popularised by
Anderson as the ``Long Tail''~\cite{And06:longtail} and greatly impacts systems
designed to host and deliver YouTube videos. The temporal dynamics of video
popularity has also attracted research efforts, trying to understand the
factors that drive success for video items~\cite{FBA11:youtube}: in particular,
both social sharing and external influence affect video popularity growth,
giving rise to characteristic patterns in temporal evolution of the number of
daily hits~\cite{CS08:youtube}.

\section{Characteristics of online social networks}
\label{sec:osn_properties}
The graphs arising among the members of online social services exhibit
certain characteristic features, many of which are equally found in other types
of social networks. Some of these properties have even entered common knowledge
and popular culture. Many of these features are not unique to
social networks, but are commonly found in other networked systems.

We present in this section a review of the most important features observed in
social graphs, arising in online services or elsewhere. Our intention is to
introduce the characteristics that are traditionally associated with social
systems and to discuss which properties are likely to be influenced by the
spatial constraints imposed by geographic distance.

\subsection{Node properties}
\label{subsec:node_properties}
A first observation about social graphs is that they tend to be \textit{sparse}:
a large fraction of node pairs are not connected, with the result that the
number of links is comparable to the number of nodes. The degree of a node, that
is, the number of connections it has to other nodes, is of particular interest
when analysing complex networks. When each node has the same topological
properties, one would expect a homogeneous degree distribution to arise, where every
node degree is tightly distributed around a well-defined average value. In these
scenarios nodes are almost interchangeable, lacking any distinguishable
individual  feature. Instead, real networked systems often exhibit a much
broader distribution, with a noticeable positive skewness or, in other common
terms, with a \textit{heavy tail}. Such distributions have been found, among
other examples, both in the network of Web pages~\cite{HA99:web} and in the
graph of connections between Autonomous Systems on the Internet~\cite{FFF99:internet}.

This broad degree distribution has two important consequences: first, individual nodes are
different from a topological perspective, as there exist
huge variations between them; second, the co-existence of a few highly connected
nodes and a large number of poorly connected elements has important consequences
for the overall network structure and for processes such as information
diffusion~\cite{Rap53:information}, epidemic spreading~\cite{PSV01:epidemic} and
attack tolerance~\cite{AJB00:attack}. In online services, where connections can
easily be established and then accumulated, this heterogeneity can be even more
marked~\cite{KNT06:structure,MMG07:measurement}.

Another remarkable feature of social graphs is that they tend to exhibit positive
correlations between node degrees~\cite{New02:mixing}. In other words,
individuals with more connections tend to connect to other
individuals with many connections: conversely, individuals with
fewer ties connect among themselves. In contrast, other complex
networks with a similar degree distribution exhibit negative
correlation, with high degree nodes preferentially connecting to
low degree ones~\cite{YJB02:internet,BBP04:strength}.  Positive assortativity, together with
transitivity (Section~\ref{subsec:clustering}), are the two main
structural features that differentiate social networks from
other types of networks: their presence seems to be intimately connected
to the tendency of individuals to cluster in groups or
communities (Section~\ref{subsec:communities})~\cite{NP03:socialnets}.
As a result, these structural properties have important effects on
several dynamic processes that take place on social networks.

\impl The heterogeneity observed across nodes in social networks could
translate to a similar heterogeneity with respect to spatial properties. In
particular, popular users with large numbers of connections might
tend to be \textit{less} affected by spatial distance than users
with fewer connections. Given the importance of such highly connected
individuals for the  processes taking place over the social network, the
spatial implications would become significant.

\subsection{Transitivity}
\label{subsec:clustering}
Sociologists have often discussed the fact that social connections are \textit{transitive},
in the sense that friends of friends are also likely to be
friends~\cite{WF94:sna}. Several theories have been put forward to understand
what drives this behaviour: Heider was the first to suggest that people tend to
seek \textit{structural balance} in their relationships~\cite{Hei46:balance}.
The idea of balance focusses on the concern that individuals have about how
their personal attitudes and opinions coincide with the attitudes and opinions
of their friends. Thus, if John is friends with Isaac, and Isaac is friends with
Kim, then if John is not friends with Kim there will be a perceived
imbalance\footnote{More precisely, Heider's structural balance theory pertains to the
perceptual level, thus the perceived existence of a social relationship is far
more important than its real existence.}. Even though the original theory was
proposed to explain the more general case of directed relationships, structural balance has been an inspiration
to many triadic closure models that aim to mimic the formation of new social ties~\cite{BC96:models}.

A crucial property of social networks is that they exhibit high levels of
transitivity: high transitivity is not common in other networked systems such as
biological and technological ones. In fact, one would expect a certain number
of existing triads in a network arising merely because of a certain level of
edge density: indeed, such a value can be calculated in a straightforward way
in simple random null models, where probabilities of connections between nodes
can be theoretically manipulated~\cite{EMB02:email}. Hence, transitivity is
often quantitatively measured by counting the number of triangles, or
triads, in the social graph, and checking whether this value is larger than one
would expect in a random graph of comparable size and density~\cite{NWS02:transitivity}. While in several
non-social systems the amount of transitivity present can be explained by
simple random models, thus suggesting the absence of significant mechanisms
influencing local network structure, the same is not true for social networks.
In general, social graphs exhibit a far higher degree of transitivity than
their random counterparts~\cite{NP03:socialnets}. 

From another point of view, the abundant presence of such triangles might be seen as
evidence of tightly connected local neighbourhoods or groups. Each social tie
belonging to a triangle is a short-range connection between two nodes that are
already close to each other, since they share at least one common neighbour.
In complex networks literature, this property has also been referred to as
\textit{clustering}: in particular, the clustering coefficient of a single node is
defined by taking into account the number of triangles present in its
neighbourhood with respect to the total number of possible
triangles~\cite{WS98:smallworld}. Again, social networks tend to have higher
values of the average clustering coefficient than random graphs of comparable size.

Statistical analysis of the evolution of large-scale online social systems
suggests that the likelihood that two individuals will establish a social
connection greatly increases as they have more friends in
common~\cite{LBKT08:microscopic} or as they are affiliated to more common
groups or communities~\cite{KW06:evolution}, confirming the transitive nature
of online social ties. This has an important impact on link prediction and
recommendation systems, which aim to find pairs of disconnected users who
are likely to establish a friendship tie.

\impl Transitivity appears to be  a strong influence behind the creation of social
ties and its importance for online services is unquestionable. Since spatial
distance affects the likelihood of connection between individuals, one would
expect that triads tend to arise between spatially close individuals. However,
whether this process is purely driven by spatial factors, by social ones, or by
a combination of them is open to discussion.

\subsection{The small-world effect and navigability}
\label{subsec:smallworld}
The rich local structure present in social networks, which implies a large
number of short-range connections, would suggest that, on average, the path
length between two random nodes can be large. For instance, a regular
lattice possesses an ordered local structure, with edges between nodes which
are close to each other, but longer paths are needed to connect nodes that are
further away.  However, social networks tend to exhibit short distances between
individuals, as found in random graphs that lack clustering.

This concept has been popularised to the general public by 
Stanley Milgram's famous experiment about social chains of
acquaintances~\cite{TM69:sixdegrees}, being termed as
the ``small-world effect''. Milgram's experiment took place in 1967: he asked 160
randomly selected individuals living in the US town of Omaha, Nebraska, to
deliver a package to a specified person living in Boston, Massachusetts, about
1,500 miles away. Participants merely received a limited set of information
about the target: name, address and occupation. However, participants were not 
allowed  to mail the package directly; they had to rely, instead, on their own
friends and acquaintances, forwarding the package to someone they felt could get it
``closer'', in any sense,  to the final destination. These friends or
acquaintances were provided with the same instructions, until the package was
eventually delivered to the target in Boston. For each transition a postcard was
sent back to Milgram with the details of the receiving party.

Even though the majority of these delivery chains did not reach the target, 64
packages were successfully dispatched to the final destination. Among these,
the average path length was close to 6 hops, which led to the popular culture reference about
the ``six degrees of separation'' between any couple of individuals on the
planet\footnote{Milgram never used this exact phrase in his original works;
rather, it was later made popular by the Broadway play and, later, by the movie
bearing the same name~\cite{Gua90:six}}. 

The apparent incongruity between clustering at a local
level, which could lead to long paths as in regular lattices, and the global
property of short connection paths was unravelled by Watts and Strogatz in their
groundbreaking work~\cite{WS98:smallworld}. A regular lattice structure can be
slightly modified by randomly rewiring a few edges, creating long-range
shortcuts; even a negligible number of rewired edges  dramatically decreases
the average path length, while the local clustering remains almost unchanged.
The conceptual implication of the Watts-Strogatz model is that the small-world
effect in real networks represents a middle point between a totally ordered and
a completely disordered system. Further, the powerful practical implication is
that any information or signal can quickly propagate through small-world
networks.

The existence of such short paths on the network does not imply that they can
be found through decentralised mechanisms with only limited local knowledge. In
fact, participants in Milgram's experiment struggled to locate the target in
the majority of cases, breaking the chain before completion. The problem of
navigability in a social network was initially discussed by Kleinberg
considering a simple graph model where long-range connections are added to a
regular lattice with probability proportional to \ensuremath{r^{-\alpha}},
where \emt{r} is the Manhattan distance between nodes~\cite{Kle00:navigation}. In
this model, a simple decentralised greedy search strategy is adopted to forward
messages to a target: such a strategy is only successful,
with an expected path length \ensuremath{O\left(\log^2 N\right)} in the number of
nodes \emt{N}, only if \ensuremath{\alpha = d}, where \ensuremath{d} is the
dimensionality of the lattice.  Any other value of \ensuremath{\alpha} results
in asymptotically longer paths. This suggests that the correlation between
long-range connections and local network structure provides important cues to
navigate the network.

Both in Milgram's experiment and in similar more recent reenactments the path
followed by the forwarded message moves geographically closer to the
target~\cite{DMW03:search}. Together with information about occupation,
geographic cues are predominantly used to select the next step in the chain,
particularly in early steps~\cite{KB78:reverse}. The interplay between
friendship and geography with respect to the navigability of real-world social
networks has been discussed in a work by Liben-Nowell et
al.~\cite{LNKR05:routing}. Through simulation of the message-forwarding
experiment on a large-scale online social network in the USA, a purely
geographic greedy search strategy is shown to achieve results comparable to the
original experiment by Milgram. Yet, the social network exhibits
long-range connections spanning geographic distance \emt{d} with probability
proportional to \ensuremath{d^{-\alpha}}, with \ensuremath{\alpha \approx 1}.
The apparent contradiction between the navigability of the network, which would
require \emt{\alpha \approx 2} as the Earth's surface is
two-dimensional, and the spatial distribution of connections is then resolved
by considering the large variance in population density and adopting
a novel geographic notion, \textit{rank-distance}, to control for heterogeneous
population density and hence reconcile the two results.

\impl As demonstrated by Liben-Nowell et al.~\cite{LNKR05:routing}, and also as
suggested by Backstrom et al.~\cite{BBR11:four}, geography plays a huge r\^{o}le in
forming the global structure of social networks, with important consequences for
the navigability of the network itself. However, the relationship between the
topological position of social connections and their
geographic properties is still unknown.
%            which signals whether they are
%embedded in a community rather than bridging separate groups, 

\subsection{Community structures}
\label{subsec:communities}
Individuals tend to organise themselves into social groups at different scales:
families, working and friendship groups, villages, towns and nations. The
properties of these  groups have been studied for a long time, as they
constitute the building blocks of human society~\cite{Col64:sociology}.
Researchers in the social sciences have often investigated the emergence of
structurally cohesive groups of individuals  and the tendency of the members of such groups to exhibit high levels of
homogeneity~\cite{WF94:sna}.

From a structural point of view, the presence of such groups causes the social
network to exhibit specific patterns.  More precisely, the structure of social
ties between people presents strong local inhomogeneities, with a higher
concentration of connections within social groups and fewer links between them.
This feature of networks has been termed \textit{community structure} and has
been observed not only in social networks but in numerous other networked
systems: for instance, links among Web pages reveal communities which likely
correspond to clusters of sites related to the same topic~\cite{DGP07:web}.

Yet, even though an abundance of measures have been proposed~\cite{WF94:sna,
GN02:community, FLG02:community, NG04:algorithm, RCC04:community}, no clear
consensus has been reached on a quantitative definition of community. Several
algorithms have been devised to partition a network into
communities~\cite{New04:community,DJA05:community, For10:community}: each
method relies on a given assumption about which properties a community should
exhibit and most of them are computationally intensive and unable to scale to
large networks. A fast and widely adopted algorithm is the Louvain
method~\cite{BGL08:fast}, based on the optimisation of the modularity measure
proposed by Newman and Girvan~\cite{NG04:algorithm} and able to scale to large
networks. 

Finally, Newman and Park suggested that both transitivity of friendship and
assortativity can be captured by a simple model where social connections arise
from users' affiliations to one or more groups or
communities~\cite{NP03:socialnets}. This suggests that the existence of social
groups has a strong impact on the properties of social networks.

\impl  The existence of communities in social networks reflects the tendency of
individuals to form groups. These groups tend to be constrained by geography,
with smaller communities being spread over a smaller
area~\cite{OAG11:geogroups}. Since people close together are more likely to be
connected, dense communities might arise merely because of geographic
proximity. Researchers have therefore proposed community detection
methods that control for spatial proximity to extract more meaningful
communities~\cite{EEBL11:expert}. Since the relationship between social and
spatial factors could be more complex than the one captured by controlling for
the distance-dependent probability of connection, a better understanding of spatial
patterns of social connections could greatly benefit our understanding of
communities.

\subsection{Homophily, social influence and information diffusion}
\label{subsec:homophily}
Homophily is the tendency of people to establish links with other similar and
like-minded individuals. Homophily is a powerful driver of social link
formation~\cite{MSLC01:homophily}; several studies have shown that there are
important connections between rising similarity between individuals and
potential social connections~\cite{CCH08:feedback}. Many systems to predict social link
formation are based on the assumption that similar users might want to connect
with each other~\cite{LK07:prediction}. However, homophily can also lead to
segregation. Schelling demonstrated that, even in an uncomplicated model, global
patterns of spatial segregations can arise from homophily sought by agents at
a local level, even if no individual actively seeks
segregation~\cite{Sche69:segregation}. Local homophily can thus easily
partition a social system into spatially segregated components.

While similarity tends to foster new social links, existing ties can cause
individuals to become more similar by fostering the spreading of trends and
innovations~\cite{Rog95:diffusion}. A few studies have investigated the
influence that friends can have on product adoption~\cite{SS98:influence}. Such
viral adoption, that is, behaviour spreading due to
person-to-person recommendation, has been intensively researched, trying to
understand its dynamics and control its evolution~\cite{LAH07:amazon}.
However, other results suggest that evidence of viral spreading could be
greatly overestimated, since a large fraction of such social pressure could also be
explained by the fact that connected users are likely to share interests,
precisely because of homophily effects~\cite{AMS09:influence}. 

Intimately related to social influence, information diffusion is another
important process taking place on online social services.  Pieces of information
or content items are thought to face lower barriers than products and
behaviours when spreading over social connections, quickly and widely reaching a
large portion of the network~\cite{GGL04:blogspace}.  Such an information
dissemination process over the network one hop at a time is often referred to as
a \textit{social cascade}~\cite{Bik97:fads}.  Social cascades have been
investigated in sociology, economics and marketing for more than 60 years: an
eminent example is the threshold model proposed by Granovetter, which models
information propagation as a local process depending on friends'
adoption~\cite{Gra87:threshold}. 

More recently, researchers have harnessed
large-scale datasets to track and study social cascades in online services. Adar
and Adamic studied the diffusion of information in blogs by adopting epidemic
models of spreading~\cite{AA05:blogspace}. Another large-scale characterisation
of information cascades using data from Flickr was presented by Cha et
al.~\cite{CMG09:flickr}: their findings show that information does not spread
widely through the network, but remains close to the initial seed. A more recent
study on URL diffusion over Twitter also supports this claim~\cite{RBC11:urls},
      since the authors find that cascades extend only over  a few additional hops
      beyond the initiator.

\impl Overall, there appears to be a strong connection between the behaviour of
similar individuals and their social connections. This has important spatial
implications when social connections tend to be constrained by geographic
distance, as viral spreading can be similarly spatially limited. Equally, other
dynamic processes taking place on social networks could have a strong spatial
component.

\subsection{Temporal evolution}
\label{subsec:growth}
Networks are dynamic entities: they evolve over time, as their edges are
rearranged and altered, but they equally change when new nodes and connections
are added and deleted. One of the first attempts to describe the growth of real networks
over time is the \textit{preferential attachment} model, put forward by
Barab\'{a}si and Albert to explain the emergence of a scale-free network with
power-law degree distribution between Web pages~\cite{BA99:ba}. This model
relies on a generic growth mechanism where new nodes with constant degree are
continuously  added to the network and their new connections preferentially
attach to already well-connected nodes. Further analysis of the preferential
attachment model revealed that it results in a degree distribution where
the probability of nodes having \emt{k} connections is given by \ensuremath{P(k)
  \propto k^{-\gamma}},
  with \ensuremath{\gamma = 3}~\cite{BRS01:scalefree}. At the same
time, several modified versions of the preferential attachment model have been
put forward to describe patterns arising in other complex networks; a
thorough review is given by Boccaletti et
al.~\cite{BLMC06:report}.

The preferential attachment model reproduces the stationary properties of
scale-free networks, but it also imposes two important constraints on network
growth: the average node degree remains constant over time and the network
diameter is a slowly growing function of the number of nodes. However, in 
an extensive study of several networks spanning different domains,  Leskovec
et al.~\cite{LKF06:densification} found that real networks tend to exhibit
increasing average degree and decreasing diameter: graphs appear to be 
\textit{densifying} and \textit{shrinking} at the same time. Hence, they
propose two new families of growth models based on
community-driven node connections and on a copying mechanism of neighbours'
links through an epidemic spreading process.

While online social networks exhibit scale-free structure, they also present
high levels of clustering. This appears in contrast with the preferential
attachment model, which predicts vanishing clustering as the system size grows.
This discrepancy has been addressed by a tunable model that adds a triad
formation step to the original mechanisms~\cite{HK02:tunable}. The final result
is that the scale-free degree distribution is maintained but the level of
clustering can be increased up to what is found in existing networks. This
modification suggests that a triangle closing mechanism could be as
important as the preferential attachment principle in driving network growth:
both these processes mimic how social networks could grow, namely by means of
users connecting to popular individuals and to friends of friends.

The evolution of online social networks at a microscopic level
has been quantitatively studied by Leskovec et al.~\cite{LBKT08:microscopic}.
Their findings suggest that the three main processes driving network growth are
a preferential attachment mechanism for new nodes joining the network,
inter-edge waiting times distributed according to a truncated power-law and a
simple triangle closing mechanism to provide clustering.

\impl The two main processes driving the growth of social networks seem to be
a global attachment process that favours popular nodes, and a local
clustering mechanism that results in triads, communities and clusters. The
interplay between these two forces and geographic factors is yet to be 
understood fully, in order to devise evolutionary models that describe the spatial
properties as well as the social characteristics of social networks.

\section{Location-based social services}
\label{sec:lbsns}
The emergence of online social networking services has revolutionised the Web,
    driving the transition from a static, one-way information transfer to a
    dynamic, user-generated and interactive communication.  
    Another important trend we have seen is the growing popularity of
    powerful mobile devices, making available accurate and cheap
    location-sensing technologies to mobile users.
%Web has caused a substantial shift in the feasibility of pervasive services:
%since modern devices offer location-sensing capabilities, the ability to share
%where the user is, to generate location-tagged information and to search for it
%makes the deployment of location-based services a common reality.

The combination of these two powerful trends has recently resulted in
an innovative generation of services: location-based social
networks. These platforms combine geographic services, such as
geocoding and geotagging, with online social interactions, enabling users to
generate and share information about the locations they visit. In this section
we briefly summarise the timeline of location-based social services; we then
focus on the concept of place as an integral part of the mobile
experience offered to users by these services. Finally, we discuss the influence
that these novel location-sharing features might have on online user behaviour. 

\subsection{A brief history of location-based social services}
\label{subsec:lbsn_history}
The first attempts at building location-based services with explicit support
for user social interactions were mainly research experimental designs,
exploring the implications and consequences of these services in a controlled
and monitored environment. Paulos and Goodman explored how ``familiar
strangers'' 
whom people were meeting repeatedly
in public spaces could be approached through a
mobile application~\cite{PG04:stranger}. Similarly, Eagle and Pentland designed
and built the Serendipity mobile system, enabling users to detect and identify
proximate people~\cite{EP05:mobilizing}. The semantic concept of place was
explored by Wang and Canny~\cite{WC06:place}, studying user reactions to an
idea of location going beyond the raw geographic coordinates
often used in previous location-based services. 

%\todo{Mention Flickr and geotagged photos}

The first large-scale commercial location-based social service to gain
appreciable traction among users was Dodgeball. Created in 2000, Dodgeball was
a mobile application to distribute location-based information to 
friends, in order to facilitate social gatherings within
cities~\cite{Hum07:dodgeball}. Among the many innovations introduced by the
service, Dodgeball pioneered the concept of  a ``check-in'' as used by today's location-based social services,
in the form of a  SMS text message sent to the central server with the indication
of the user's current location; this information was then sent to the
user's friends, again via SMS messages.

Dodgeball was bought by Google in 2005, which later discontinued the service to
make way for their own location-based system, Google Latitude. However, other
location-based social services soon followed. Brightkite was founded in 2007 as
a social networking website that allowed users to share their location with
their friends, providing a crowdsourced database of places that users could
access and modify. Similar features were offered by Gowalla, another
location-based social network created in 2009; this  again supported check-ins,
  which were shared with friends on the service. Also in 2009, the original
creators of Dodgeball launched Foursquare, an innovative location-based social
network that added game mechanics on top of traditional place check-ins; the
user with the highest number of check-ins in the last 60 days is deemed the
\textit{mayor} of a place,  encouraging location sharing as users compete to win such ``mayorships''.
  Foursquare has since
overshadowed the other services, accruing millions of users and becoming the
most popular location-based social service available. It now allows users to
leave tips about places, create lists of places to visit; it
also features a sophisticated place recommendation system. 

The landscape of location-based social services includes many other examples. 
  The most popular online social services have launched location-based features, enabling users to specify
a particular location when they share an item, a status or a photo. 
For instance, Twitter offers the option of tagging
each status update with a geographic location. Facebook has launched a similar
feature, allowing its users to specify a location or a place when posting 
updates. 

These developments are likely to bring location-based features to the vast
majority of users, expanding the initial nucleus of eager early adopters. Other services
mainly used on mobile devices, such as the popular photo sharing application
Instagram or the local businesses directory Yelp, take advantage of a large
catalogue of venues to facilitate the creation and retrieval of information
related to physical places.  Overall, the main technological trend seems to
be the convergence of social, mobile and local applications towards a
unique user experience, with the potential to bridge the gap between online
information and the physical world.

\subsection{Features of location-based social networks}
There are many location-based services available to
mobile users, with purposes ranging from searching for local businesses and locating the user
on a map, to recommending interesting places to tourists. 
As online social networks become increasingly location-aware, the line between
purely location-based services and more general online social platforms is blurring. 

This dissertation focusses mainly on the interplay between spatial factors and
online social services; hence, we concentrate on location-based
services that feature both a strong social component and an engaging experience
revolving around physical places. Our aim is not a complete
classification of location-based social services,  and we focus our interest on
services based on these key concepts:
\begin{itemize}
\item users establish social connections among them and can interact with each
other, publicly sharing their friend lists;
\item users are able to access, search and modify a database of
\textit{places} and their related information;
\item users mainly interact with the service through a mobile device, which is
able to personalise the service based on the user's current location;
\item users can voluntarily disclose to the service the exact place where they
are, through an action referred to as a \textit{``check-in''}: such
information can then be made public or shared only with friends.
\end{itemize}

As individuals use these location-based social services they leave
behind them digital traces of both their social interactions and their spatial
movements.  These are enriched with data about the 
nature of the places visited by each user, adding an additional layer of
information with rich semantic implications.

   Location-based social networks represent the ideal
systems to investigate the spatial properties of online social services.  First
of all, they uniquely and simultaneously provide data about both social connections
and geographic movements, making detailed spatial analysis possible.  In
addition, user location information in these services is often more accurate
than text-based descriptions available in other online
systems~\cite{HHS11:bieber}, since it is directly acquired through
location-sensing mobile devices.  Finally, to this date, these services have
already accumulated hundreds of thousands, and sometimes millions, of users,
thus enabling large-scale studies that can uncover general properties
and trends.

\subsection{The importance of places as  online entities}
The main innovation introduced by location-based services is that user activity
takes place in the physical world, and not in intangible Web space.  Thus,
in order to be represented in the online setting, real locations have to
be mapped to a new virtual counterpart, the \textit{place}.

% location vs place vs poi
These places, also called \textit{venues} in the context of such services,
represent references to geographically placed physical entities. 
The concept of place on location-based services is strongly
similar to the concept of \textit{Point Of Interest} (POI), widely used in
cartography to signal that a set of coordinates on a map is relevant or
interesting with respect to a given context. In our scenario, a place is a
human-defined POI; the difference between a location and a place is
subtle but important.  Whereas a location is a geographical construct
immutable over time, like a fixed point on the Earth's surface, a place is
a POI loosely coupled with a location. In other words, in theory a place can move to
another location and still be the same place, maintaining its semantic
implications for users~\cite{Gal10:poi}. 

% metadata and applications
The importance of representing places on the Web is a theme that goes beyond
location-based platforms. In fact, 
  there is a potentially vast set of additional data that can be added to the
  virtual representation of a place,
such as an address, a name, a description, category or type information,
contact details, and so on.  As online services start offering the ability
to create and search for information related to physical places, new
applications and systems will be designed and implemented to take advantage of
these new layers of information. To achieve this, new standards
and methods will be required to harmonise how places are referred to and
represented on the Web; the importance of this aspect is confirmed by the
existence of the W3C Points Of Interest Working Group, launched in 2010 with a
``mission to develop technical specifications for representation of POI
information on the Web''\footnote{More information is available online: 
\url{http://www.w3.org/2010/POI/}.}. As more and more places are represented
on the Web, and as providers exploit this vast catalogue to build
location-based services, the connection between the online and the
physical worlds will become stronger and more interesting.


\subsection{The impact of location sharing}
The landscape of location-based social services appears exciting and
highly dynamic: given their infancy,  there are some important factors that are influencing user
adoption and, as a consequence, the characteristics of user activity.

Since these platforms are still in a relatively early stage, they tend to
attract mainly enthusiastic early adopters; as discussed by Rogers, these
users are eager to try new innovations and tend to be young, to have high social
status, to be highly educated and to be socially
open~\cite{Rog95:diffusion}. This means that the user audience of these
services is far from being representative of the bulk of Web users, and even
more different from the overall composition of the society.

The act of sharing one's location raises important privacy concerns; users are
increasingly aware that information about their whereabouts can be highly
sensitive and easily misused. For instance, in 2010 a Web service called ``Please
rob me''~\furl{pleaserobme.com/} was set up to extract automatically 
Foursquare check-ins publicly shared on Twitter, raising users' awareness about
giving away the type of information a burglar would love to have.  In addition,
privacy groups argue that the privacy policies of online companies collecting
location data are ``uneven at best and inadequate at
worst''~\cite{Mor10:privacy}. Overall, when users are concerned about sharing
their location they could react by selectively avoiding to disclose  when they are
at certain places or by entirely refusing to join a location-based service. 

Finally, these services need to be accessed via applications available
only on mobile devices, because they take advantage of location-sensing
technology to help the user navigate nearby venues. While smartphones are
quickly widening their user base, the considerations about the particular
demographics of early adopters equally apply in this case. At the same time,
             affluent people living in cities are known to be more eager to try
             online social networking services\footnote{This was true as early
               as 2009 for large-scale services, as presented by a Nielsen
                 report (available online at
                     \url{http://blog.nielsen.com/nielsenwire/online_mobile/the-more-affluent-and-more-urban-are-more-likely-to-use-social-networks/}).}.
This bias could be of greater importance for location-based platforms, as users
located in cities with a high density of places are more likely to be enticed to
join than users living in less populated and sparse areas that offer less
spatial variety. In addition, the
potential interest in discovering new venues is higher in dense urban
environments than in smaller settings, where users may already be familiar with
most of their surroundings.

\section{Social-based systems and applications}
\label{sec:soc_systems}
The study of the properties of online social services, which has largely drawn
from related findings in physics and social sciences, has several practical
applications. In particular, as users spend more and more time using online social
platforms, a better understanding of the structural properties of the resulting
social graph is needed to design architectures and services appropriately.

In this section we discuss some examples of systems related to online social
services, emphasising  how spatial information could be exploited to improve
existing design choices or to create new applications. 
The mechanisms and factors that drive the temporal evolution and growth of
the social network are of paramount importance to design link
prediction systems. At the same time, the diffusion of information and
trends across social ties and the importance of homophily  greatly
impact the design of recommender engines. Finally, since online content
consumption is increasingly driven by social sharing, a wide range of
storage and delivery infrastructure solutions could largely benefit
from information extracted from the social connections among users.

\subsection{Link prediction systems}
\label{subsec:link_prediction}
Social networks are highly dynamic, since they grow and change over time
with the addition of new edges as individuals engage in new interactions or lose
contact with old acquaintances.  Online social services equally exhibit  
temporal  evolution, primarily because new users join and establish new
connections from scratch, but also because existing members form new
relationships. By understanding the mechanisms by which social
networks evolve it becomes feasible to predict which social relationships are
likely to appear in the future.

More generally, the problem of \textit{link prediction} is an important task
in network science, with important applications in every field that makes use
of network models to represent real systems.  Despite this, the
link prediction problem was initially formulated explicitly for social networks
by Liben-Nowell and Kleinberg~\cite{LK07:prediction} as the task of accurately
predicting the edges that will be added to the network during a future temporal
interval, using a snapshot of the network at current time. In this
formulation there is emphasis on predictive power rather than
on  capturing global structural
features, such as the degree distribution or the level of clustering. 
Initial results demonstrated that different proximity measures, based on
information entirely contained in the social network itself, can be
effectively exploited
to algorithmically predict which new ties will occur.

   Link prediction methods
enjoy increasing popularity thanks to their importance for online social
services. Such services strive to entice and retain their users by offering them
a pleasant experience, which often involves a rich set of social interactions.
Since users with more friendship connections benefit more from the service
themselves, systems to recommend and suggest the creation of new online ties are put
in place. Such recommending engines largely draw from models that predict
which social connections are more likely to develop. Since Facebook launched the ``People You May Know''
feature~\cite{Rat08:friends}, devoting vast engineering efforts to finding other
members that users might want to add as Facebook friends, it has become
customary to deploy such systems on social platforms. 

It is interesting to note that these approaches only focus on finding suitable
recommendations in a subset of the prediction space: namely, Facebook considers
only pairs of users who already share at least one friend. This is due to the
fact that in a large graph the total number of node pairs  grows quadratically
with the number of nodes. Facebook, with its 900 million users, would face a
prohibitively large  task if it searched the entire prediction space for useful
predictions.

More recently, researchers have advocated supervised approaches to link
prediction, given the possibility of modelling the task as a binary
classification problem. In particular, Lichtenwalter et
al.~\cite{LLC10:prediction} have presented a detailed analysis of challenges in
link prediction systems, discussing the extreme imbalance inherent in link
prediction tasks, where the number of positive instances is overwhelmed by the
number of negative cases. They propose to mitigate these
problems by treating prediction separately for different classes of potential
friends.

\impl  Information about spatial movements of users across places
can reveal a great deal about their preferences and interests, thus
improving models of link prediction based on user similarities. In addition,
spatial distance could be used as a filtering mechanism to select
promising prediction candidates on which more complex models could be
run.

\subsection{Recommender engines}
\label{subsec:recsys}
As soon as the Web allowed users to access unprecedented amounts of
information, recommender systems emerged to help individuals navigate such
large collections of data according to their personal interests. These systems
are designed and built around the principle of \textit{collaborative
filtering}, the central hypothesis of which is that content items should be ranked
``based on the premise that people looking for information should be able to
make use of what others have already found and evaluated''~\cite{ME95:cf}. The
key insight here is that the collective set of user preferences can be used to
help each individual.  Similarity between users is computed according to
their item preferences. Then, user preferences that are unknown for certain items
can be predicted by considering ratings by similar users.

Social connections have mainly been introduced in recommender engines as trust
relationships; in this context, trust is considered to be the level of belief
established between two individuals~\cite{JIB07:trust}. In a much broader
sense, social trust can be described as the willingness to take some action as
the result of receiving information from a given producer~\cite{Gol08:trust}.
Hence, the main challenge is to estimate the level of trust between users.
Trust-based connections can be implicit, when inferred from user
preferences over the set of items, and explicit, when based on declared
social links between users. In the latter case, connections
between users can be established by requiring individuals to rate other users explicitly 
and quantify whether they agree with their item preferences, or by
asking them to import or otherwise reveal their friendship ties.

While such connections may merely be adopted as an alternative to item-based
similarity measures, recommendations also have a powerful social aspect that
goes beyond user similarity.  In fact, individuals turn to their friends for
recommendations in their daily lives, seeking advice about products or content
items.  Furthermore, users' perception of recommendation is strongly
influenced by social aspects; experiments have shown that users overwhelmingly
favour recommendations from familiar rather than from similar
individuals~\cite{BSH07:devil}. Hence, in order to improve recommender systems
and to provide more personalised recommendation results, researchers have
proposed to incorporate the large amount of social network information
available on online services into recommendation engines~\cite{MZL11:social}.
These first attempts demonstrate that social ties might greatly extend the scope
and improve the performance of recommendation systems.

\impl  A large fraction of information tends to have a spatial locality of interest,
such as news, politics, sports, shopping items and restaurants.  Moreover, as the
focus of recommender systems shifts from online content to physical entities
such as places, the spatial factors shaping social user behaviour are bound to
become increasingly more relevant.  

\subsection{Social-inspired system design}
\label{subsec:system_design}
A natural way of exploiting the properties of the social connections between
members of an online service is to guide and suggest design choices regarding
its supporting infrastructure and its software implementation. Given the sheer size of their
user audiences and their planetary scope, successful online social services face
the problem of replicating as much of their data as possible across the globe,
in order to distribute the load on their infrastructure and reduce service
latency.  Numerous features offered by such services correspond to
a many-to-many paradigm, with users simultaneously producing and consuming
content through their social connections. This creates many complex
inter-dependencies between data items, complicating the design of the service
infrastructure and of the distributed storage architecture.

However, by exploiting the complex structure of the social network
formed by users' connections, design choices can be made to optimise the
services offered to users. Wittie et al. showed that Facebook interactions
take place mainly between friends in the same geographic
region~\cite{WPD10:locality}; hence, they propose to ease the load experienced
by a centralised server infrastructure by introducing distributed regional
proxies, reducing latency experienced by users at the same time. The inherent
community structure arising from social connections has been exploited to
partition service users into communities and, thus, optimise data distribution 
across storage locations, as done in a company's email
network~\cite{KGN10:hermes} and on Twitter~\cite{PES10:engines}. 

Since social connections between users drive content consumption on online
services, other attempts have tried to improve how such content is delivered
and served. Silberstein et al.~\cite{STC10:frenzy} sought to optimise personalised
news feeds by differentiating users based on
their content production rate. By  querying in real-time content produced by
high-rate users and caching content created by low-rate users, they reduce
both latency and system load. The viral diffusion of content across online
social connections has been exploited to rank popularity of content
items dynamically based on the fraction of social-generated requests they
experience, and to optimise content storage across disks in order to reduce energy
consumption~\cite{SC10:spinthrift}.

Finally, by observing network properties of individual users it becomes
possible to find outliers that substantially deviate from expected behaviour;
these individuals could be malicious users or could signal that a
particular user account has been compromised. For instance, a security mechanism
devised by Backstrom et al.~\cite{BSM10:findme} is based on a probabilistic
framework to predict the geographic whereabouts of Facebook users by
inspecting where their friends are located; when a user logs in from an
unexpected location, additional security mechanisms could be set in place to
prevent fraudulent activity.

\impl The geographic properties of online social services have a huge potential
to inspire the design of systems and infrastructure; in fact, some research
attempts have already explored ways of exploiting some characteristics such as
spatial locality of access patterns. However, a better understanding of the spatial
characteristics of dynamic social processes would greatly expand the scope of
social-inspired system design.

\section{Present dissertation and future outlook}
\label{sec:diss_outlook}
This chapter has reviewed the different types of online social services
available to users and the most important characteristics of the social networks
arising among their members, introducing also the new generation of location-based
social services and their particular characteristics. We then discussed 
systems exploiting the characteristics of online social services.

Since social ties are constrained by spatial distance, our 
discussion mentioned several times its effect on all the main
characteristics usually exhibited by online social networks.  
Furthermore, the emergence of the mobile Web makes location-based services a pervasive
reality, allowing the introduction of a novel entity in the online realm:
the place. This innovation enables users to interact not only among
themselves, but also with spatial entities, revealing their whereabouts
and their usual geographic movements. In summary,  the spatial
properties of online social services are increasingly more accessible and, at
the same time, more important to understand their structure fully and to 
design effective systems and applications.

This dissertation is a step in this direction. Given the impact that geographic
space has on online social ties, as discussed in
Section~\ref{sec:geography}, we plan to study and understand the relationship
between social and spatial factors.
%and how they might jointly influence the
%social connections among users. 
To this end, we present a comparative study
of the spatial properties of different online social services and we discuss new
measures that can be used to capture social and geographic
characteristics of online users simultaneously (Chapter~\ref{ch:structure}). These findings are
then extended to define a temporal model of network growth that reproduces
social and spatial patterns seen in real data (Chapter~\ref{ch:model}).

We also present two practical applications that stem from the availability of
spatial data about online social networks: a link prediction engine based on the
places visited by users (Chapter~\ref{ch:prediction}), and a family  of caching
policies for distributed content delivery networks that exploit content
diffusion over the social graph to discover geographic patterns of item popularity
(Chapter~\ref{ch:caching}).

Given the results and findings about online social networks discussed
in this chapter, many research directions that consider space and geography
could be further explored. We will consider them at the end of this dissertation
(Chapter~\ref{ch:conclusion}).
